perm filename PIERCE[1,JMC] blob sn#005224 filedate 1969-11-29 generic text, type T, neo UTF8
00100		This reply to J.R. Pierce's letter (this Journal  1969)
00200	in  which  he  proposes  that researchers in speech recognition
00300	give up, is motivated more by respect for Pierce's position  in
00400	the  research  support  establishment  than  by  the  arguments
00500	contained in his letter.    The points made in his  letter  are
00600	of  two kinds, sociological and scientific.  We will paraphrase
00700	and reply to these points in the order  in  which  he  presents
00800	them  and then make some sociological and scientific remarks of
00900	our own.  Anything in quotation marks is  taken  directly  from
01000	Pierce's letter.
01100	
01200		1.  "The  sort of human behavior encouraged by the lush
01300	funding of science and engineering  after  World  War  II,  and
01400	especially after Sputnik, has been more imitated and elaborated
01500	than commented on or analyzed.  Some of the  strangest  aspects
01600	of  post-Sputnik  behavior  are  exhibited  in  work  on speech
01700	recognition as well as  in  other  intriguing  fields  such  as
01800	space,  artificial  intelligence, and cybernetics." Pierce then
01900	goes on to say that the work is carried on because the  subject
02000	is  glamorous  and  one can get money for it. "To sell suckers,
02100	one uses deceit and offers glamor." No qualifications are  made
02200	and no arguments are given, so no reply is possible.
02300	
02400		2.  Next  we  have the speculation that the research in
02500	speech recognition is partly motivated by a desire to have  the
02600	computer play Turing's imitation game.  Pierce correctly points
02700	out that it is possible to fool some of the people some of  the
02800	time by a program that doesn't understand anything.
02900	
03000		3.  Next  Pierce discusses communication with computers
03100	as a motivation for speech recognition research.  He begins  by
03200	saying  that  communicating  with  computers  by speech is like
03300	controlling a car with gee and haw and bridle and reins, and he
03400	adds that we do quite well with keyboards,  cards,  tapes,  and
03500	cathode-ray tubes.  Most people talk four to ten times as  fast
03600	as they can type, so there is a large potential advantage  here
03700	if it can be realized. Pierce, however, seems to be in  a  mood
03800	of not conceding anything to the enemy.
03900	
04000		4.  Pierce  thinks  the  speech recognition efforts are
04100	doomed and says so as follows:
04200		"There  are  strong  reasons  for believing that spoken
04300	English is, in general,  simply  not  recognizable  phoneme  by
04400	phoneme  or word by word, and that people recognize utterances,
04500	not because they  hear  the  phonetic  features  or  the  words
04600	distinctly,  but  because  they  have a general sense of what a
04700	conversation is about and are  able  to  guess  what  has  been
04800	said."  This  is  buttressed  by an 1899 quotation from William
04900	James, who says the same thing, and by anecdotes. Further: "These
05000	considerations  lead  us  to  believe  that  a general phonetic
00100	typewriter is simply impossible unless the  typewriter  has  an
00200	intelligence and a knowledge of language comparable to those of
00300	a native speaker of English.  This leaves a  narrower  question
00400	open.   Are more limited, satisfactory economic applications of
00500	voice control realizable?"
00600	
00700		We   agree   that   phonetic
00800	information  nominally  present  in speech is often omitted and
00900	that typing what was said in standard  English  often  requires
01000	understanding.      There   are,   however,   four   mitigating
01100	circumstances:
01200			a.  It is not yet clear how much spoken English
01300	can be transcribed using only phonetic and syntactic clues.
01400			b.  It  may  be  possible to produce a phonetic
01500	transcription  of  what  was  actually   said   that   may   be
01600	comprehensible  to  humans  and  usable for various purposes by
01700	further computer processing.
01800			c. When communicating directly with a computer,
01900	all the restrictions determining what messages make sense  that
02000	are  available to a human may also be available to the program.
02100			d. Research in  the  semantics  of  English  is
02200	proceeding,  albeit  even  more  slowly than research in speech
02300	recognition, and whenever  this  research  reaches  a  suitable
02400	stage,   it  can  be  connected  with  the  speech  recognition
02500	research.
02600	
02700		5. Pierce then knocks down the  argument  that  we  may
02800	learn something about speech via speech recognition research by
02900	saying  that  the  "recognizers"  mostly   behave   like   "mad
03000	inventors"  and "untrustworthy engineers".  He does not say who
03100	the exceptions are.  Without actually conducting  a  survey  we
03200	contradict  him  by  saying that most recognizers are seriously
03300	learning about speech and that their experience in  recognition
03400	studies improves our ideas about what it is important to learn.
03500	
03600		However,  besides  learning  about speech, it turns out
03700	that there is a lot  to  be  learned  about  computer  science,
03800	namely,  how  to  develop  procedures  that  work  well in very
03900	complex circumstances.
04000	
04100		6. The results of the experiments  are  often  obscure.
04200	Regrettably,  this  is true, and the fault is often that of the
04300	experimenter.  As an experimental field,  computer  science  is
04400	quite  new,  and  the  standards  for  what  constitutes a good
04500	experiment are not  really  formulated  yet.   However,  it  is
04600	worthwhile  merely  to  exhort  people  to  plan  their  speech
04700	recognition,   game   playing,   theorem    proving,    picture
04800	description, and other experiments so that more will be learned
04900	than just whether the method worked.
05000	
00100		7. Pierce concludes  with  an  estimate that 95 percent
00200	recognition has been achieved with small vocabularies  and  few
00300	speakers,  somewhat  better  with  one  speaker.     He sees no
00400	economically sound application for  this.     Our  estimate  of
00500	applicability  is  as  follows:     At  present  (Reddy  1969),
00600	vocabularies of 100 words can be recognized in real  time  with
00700	95  percent  accuracy  on a PDP-10 computer.   This is not good
00800	enough.     Given, however, the next factor of ten  in  machine
00900	performance,  programs  that  give  visual feedback of what the
01000	machine thought it heard, and the next few years' improvement in
01100	present  programs,  economically useful results may be achieved
01200	in  limited  communication  circumstances  like  controlling  a
01300	time-sharing system.
01400	
01500		8.   Finally, Pierce proposes some reasonable questions
01600	for people to  ask  themselves  before  they  undertake  speech
01700	recognition research.
01800	
01900		We have included all the scientific comments we wish to
02000	make about the field of speech recognition in  our  answers  to
02100	Pierce's  points,  so  we  will conclude with some sociological
02200	remarks.
02300	
02400		1.   Pierce's characterization  of  his  letter  as  an
02500	appeal  to  people  engaged  in  speech recognition research is
02600	disingenuous.      As director  of....      at  Bell  Telephone
02700	Laboratories  he  has killed off their speech recognition work,
02800	and  the  journalistic  style  of  his  letter   warrants   the
02900	interpretation  that  it  is  directed more at the suppliers of
03000	funds for such research.    Pierce  was  chairman  of  the  ...
03100	Committee  ...  that  did a hatchet job on language translation
03200	research based on an only slightly more scientific study of the
03300	subject.
03400